-
Notifications
You must be signed in to change notification settings - Fork 68
🌱 Use Patch instead of Update for finalizer operations #2328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
The ClusterExtension and ClusterCatalog finalizer code used Update, the ClusterExtensionRevision used Apply, so the CE and CE finalizer code was rewritten to be like CER. There is common finalizer code between them now. |
|
Ping @joelanford |
|
catalogd pod is panicing... it only fails due to the summary generated, it's not "noticeable" during a regular run! |
If we want to remove conflicts, we should use patch, but adding/removing finalizer we do not touch removed fields (including the annotation). Controller-runtime cache is write-through cache, so an update or patch operation does not perform any direct modification on the cache - all changes are made by the informer. In what situations did we see conflicts, given that only single controller is responsible for a resource type? |
This is not a conflict issue. The problem was that we are using Update(), which uses the cached version of the resource to make the changes. The cached version no longer includes the This causes a problem with e.g. kubectl applying a ClusterExtension a second time. If you apply a CE twice without this code (i.e. current The problem is that the |
21e90d9 to
0608a72
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2328 +/- ##
==========================================
- Coverage 74.45% 74.36% -0.09%
==========================================
Files 93 94 +1
Lines 7300 7346 +46
==========================================
+ Hits 5435 5463 +28
- Misses 1433 1441 +8
- Partials 432 442 +10
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| return false, fmt.Errorf("marshalling patch to add finalizer: %w", err) | ||
| } | ||
| // Note: Patch will update obj with the server response, including ResourceVersion | ||
| if err := c.Patch(ctx, obj, client.RawPatch(types.MergePatchType, patchJSON)); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use SSA here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that could be a good improvement, that would simplify the reconcile logic a lot, i.e. we do not need to care about appending/sorting finalizers at all.
Thanks for explaining it, make sense. Given that I think we should update PR description to reflect that, currently we have:
Given the explanation above, we are not reducing conflict or improving performances - we are ensuring correctness, i.e. not removing the annotation. |
cf27d6e to
bfb7a16
Compare
internal/catalogd/controllers/core/clustercatalog_controller.go
Outdated
Show resolved
Hide resolved
internal/catalogd/controllers/core/clustercatalog_controller.go
Outdated
Show resolved
Hide resolved
internal/catalogd/controllers/core/clustercatalog_controller.go
Outdated
Show resolved
Hide resolved
| Applier Applier | ||
| RevisionStatesGetter RevisionStatesGetter | ||
| Finalizers crfinalizer.Finalizers | ||
| FinalizerHandlers map[string]FinalizerHandler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a controller should set single finalizer, more than one is strange. I would suggest reverting the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is more than one for CEs right now because there are multiple finalizer concerns:
- Cleaning up the unpacked bundle image directory
- Shutting down the content manager informers (helm-only)
I'm not sure how the current helm-to-boxcutter migration deals with the content manager informer shutdown. Do those informers get cleaned up when boxcutter takes over or when the CE is deleted (potentially much later). Ideally we'd shutdown the helm-only content manager informers as soon as boxcutter takes over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(but that's getting a bit outside the scope of this PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, there ARE two pre-existing finalizers on the ClusterExtension
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how the current helm-to-boxcutter migration deals with the content manager informer shutdown. Do those informers get cleaned up when boxcutter takes over or when the CE is deleted (potentially much later). Ideally we'd shutdown the helm-only content manager informers as soon as boxcutter takes over.
In order to switch from the Helm applier to the Boxcutter applier, the pod has to restart, so everything is effectively shutdown. The content-manager finalizer is a no-op under Boxcutter, but the finalizer still needs to be removed.
| u.SetGroupVersionKind(gvk) | ||
| u.SetName(obj.GetName()) | ||
| u.SetNamespace(obj.GetNamespace()) | ||
| u.SetFinalizers(newFinalizers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it has to be u.SetFinalizers(finalizers)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. this is based on the assumption that the arguments to EnsureFinalizers represent the only finalizers we want in the list. This is not the intent.
The EnsureFinalizers API ensures that the given finalizers are present. The intent is not to set the finalizers to only be those given, and is not intended to remove finalizers.
Because this is a list, we can't patch individual items into the list, we need to specify the whole list (at least that's how the SSA API is working). I verified this in the code by calling EnsureFinalizers() separately for each finalizer string, rather than in bulk. Only the last finalizer showed up when it was u.SetFinalizers(finalizers). Both finalizers showed up when the code was as above.
Yes, I realize the name EnsureFinalizers is a bit confusing, and should probably be AddFinalizers, but I was following the naming scheme (and behavior) of the removed code, which was inconsistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this is a list, we can't patch individual items into the list, we need to specify the whole list (at least that's how the SSA API is working).
I need to disagree here, because Finalizer is defined as a set with merge patch stratefy:
https://github.com/kubernetes/apimachinery/blob/master/pkg/apis/meta/v1/types.go#L256-L259
Therefore, we do not patch item by item, we set all finalizers we need to own and make SSA requests.
BTW, my understanding here is that finalizers argument contains all finalizers we would like to set, not just one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Therefore, we do not patch item by item, we set all finalizers we need to own and make SSA requests.
Agreed, that's how SSA works, but that's not how the API layer above it has to work.
The function as defined adds finalizers, that's the intent. The intent is not to make the finalizer list equal to the arguments.
I'm renaming the function to indicate the desired (existing) behavior. It is confusing to have the mechanism to remove finalizers as "ensuring nothing".
If we wanted the behavior as you suggest, then SetFinalizers() would be more appropriate. But it would be at the cost of flexibility, and potential issues if other controllers add their own finalizers.
| } | ||
|
|
||
| // Update the passed object with the new finalizers | ||
| obj.SetFinalizers(newFinalizers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it has to get `obj.SetFinalizers(u.GetFinalizers())
|
|
||
| // RemoveFinalizer removes one or more finalizers from the object using server-side apply. | ||
| // If none of the finalizers exist, this is a no-op. | ||
| func RemoveFinalizer(ctx context.Context, c client.Client, obj client.Object, finalizers ...string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to have this function because EnsureFinalizer(ctx, c, obj) is going to remove previously added finalizers that we owned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intent of EnsureFinalizers is to only add finalizers to the list of finalizers, not to remove them. The RemoveFinalizers removes finalizers individually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in SSA approach, you are not remove things by omitting them in request, if you owned them previously. Hence, if we had somewhere in the code:
EnsureFinalizers(ctx, client, ce, "foo", "bar")to remove bar, we can perform:
EnsureFinalizers(ctx, client, ce, "foo")There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But that is not the intent of the API; it's to add items individually; not to remove. Just because the underlying mechanism is SSA, doesn't mean the layered API has to behave that way.
IMHO, It's confusing to have "Ensure" remove items, "Ensure" means to check that something is true or correct. In this case "Ensure" is confusing, so I'm renaming the function.
|
EDIT: All our finalizers use the same prefix, other controllers' finalizers should use a different prefix. So, now that I thought about it some more, I think a single |
bfa9322 to
9dea235
Compare
|
@tmshort apologize I did not mention it earlier: moving to SSA requires also to migrate the information stored under In our case, it is easy to reproduce:
- apiVersion: olm.operatorframework.io/v1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.: {}
v:"olm.operatorframework.io/cleanup-contentmanager-cache": {}
v:"olm.operatorframework.io/cleanup-unpack-cache": {}
manager: operator-controller
operation: Update This is expected, because we have added finalizers via
To fix that we would need to migrate the ownership under Instead of going to through SSA migration, I would suggest that we revert removal of that annotation as the quick fix, and then introduce SSA if needed + detailed testing. |
|
/hold That being said, there is currently a failure in the test-experimental-e2e, which seems to be triggered by this change; since I'm not seeing the issue on main. I would need to fix that before this could be merged. I will point out that managed fields is not a "bit of memory", but can double the size of the cache. The memory savings was significant. |
I agree, but managed fields are readded in the cache with #2318 |
Whoops, I meant last-applied-config. |
|
/unhold |
9fc4525 to
6e95c7b
Compare
6c820a5 to
8fba829
Compare
|
Reworked (i.e. rebased to |
8fba829 to
e2b3cda
Compare
| // Use patch to update finalizers on the server | ||
| patch := map[string]any{ | ||
| "metadata": map[string]any{ | ||
| "finalizers": finalizers, | ||
| }, | ||
| } | ||
| patchJSON, err := json.Marshal(patch) | ||
| if err != nil { | ||
| reconcileErr = errors.Join(reconcileErr, fmt.Errorf("marshalling patch to ensure finalizers: %w", err)) | ||
| } else if err := r.Patch(ctx, reconciledCatsrc, client.RawPatch(types.MergePatchType, patchJSON)); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about that we use already available controller-runtime's CreateOrPatch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly overkill (looking at the size of the function), but I can look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly overkill (looking at the size of the function), but I can look.
I would create RawPatch, only if really needed. CreateOrPatch() covers us both in updating finalizers and status and could simplify the code a lot, i.e. we do not need to check if statuses/finalizers are changed during the reconcile, that function does it all. The whole updateFinalizers { ... } block could be replaced with something like:
if _, err := controllerutil.CreateOrPatch(ctx, r.Client, &existingCatsrc, func() error {
existingCatsrc.Finalizers = reconciledCatsrc.Finalizers
return nil
}); err != nil {
return ctrl.Result{}, err
}and we could also go beyond that (slightly out of the PR scope, could be done separately) and also to drop if updateStatus {...} block with:
if _, err := controllerutil.CreateOrPatch(ctx, r.Client, &existingCatsrc, func() error {
existingCatsrc.Finalizers = reconciledCatsrc.Finalizers
existingCatsrc.Status = reconciledCatsrc.Status
return nil
}); err != nil {
return ctrl.Result{}, err
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly overkill (looking at the size of the function), but I can look.
As in the size of CreateOrPatch(), that is...
Dammit, I forgot to push my changes last night,...
| // Use patch to update finalizers on the server | ||
| patch := map[string]any{ | ||
| "metadata": map[string]any{ | ||
| "finalizers": finalizers, | ||
| }, | ||
| } | ||
| patchJSON, err := json.Marshal(patch) | ||
| if err != nil { | ||
| reconcileErr = errors.Join(reconcileErr, fmt.Errorf("marshalling patch to ensure finalizers: %w", err)) | ||
| } else if err := r.Patch(ctx, reconciledExt, client.RawPatch(types.MergePatchType, patchJSON)); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same question as above.
internal/operator-controller/controllers/clusterextension_controller.go
Outdated
Show resolved
Hide resolved
…mework#2338)" This reverts commit 39cbdbe.
Refactor all controllers to use Patch() instead of Update() when adding or removing finalizers to improve performance, and to avoid removing non-cached fields erroneously. This is necesary because we no longer cache the last-applied-configuration annotation, so when we add/remove the finalizers, we are removing that field from the metadata. This causes issues with clients when they don't see that annotation (e.g. apply the same ClusterExtension twice). Update ClusterCatalog and ClusterExtension controllers to use Patch-based funalizer management (ClusterExtensionRevision already uses Patch()) Signed-off-by: Todd Short <[email protected]>
e2b3cda to
97d5a33
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a fan of this package name, but I couldn't think of anything better? Suggestions welcome.
|
|
||
| // GetSpec returns the Spec field of the ClusterExtension. | ||
| // This method is provided to satisfy generic interfaces that require a GetSpec() method. | ||
| func (c *ClusterExtension) GetSpec() ClusterExtensionSpec { | ||
| return c.Spec | ||
| } | ||
|
|
||
| // GetSpec returns the Spec field of the ClusterCatalog. | ||
| // This method is provided to satisfy generic interfaces that require a GetSpec() method. | ||
| func (c *ClusterCatalog) GetSpec() ClusterCatalogSpec { | ||
| return c.Spec | ||
| } | ||
|
|
||
| // GetSpec returns the Spec field of the ClusterExtensionRevision. | ||
| // This method is provided to satisfy generic interfaces that require a GetSpec() method. | ||
| func (c *ClusterExtensionRevision) GetSpec() ClusterExtensionRevisionSpec { | ||
| return c.Spec | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is recommended to avoid adding helper functions:
https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md#no-functions
| type ObjectWithSpec[T any] interface { | ||
| GetAnnotations() map[string]string | ||
| GetLabels() map[string]string | ||
| GetSpec() T | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about that we do not introduce this interface and get access to .Spec via reflection api? with that CheckForUnexpectedFieldChange signature could become just:
func CheckForUnexpectedFieldChange[T metav1.Object](a, b T) booland we do not need to implement any additional interface on our API.
Refactor all controllers to use Patch() instead of Update()
when adding or removing finalizers to improve performance, and to avoid
removing non-cached fields erroneously.
This is necesary because we no longer cache the last-applied-configuration
annotation, so when we add/remove the finalizers, we are removing that field
from the metadata. This causes issues with clients when they don't see that
annotation (e.g. apply the same ClusterExtension twice).
Update ClusterCatalog and ClusterExtension controllers to use Patch-based
finalizer management (ClusterExtensionRevision already uses Patch())
This also restores the non-caching of "last-applied-config".
Description
Reviewer Checklist